ABSTRACT

The peak-background split argument is commonly used to relate the abundance of dark 
matter halos to their spatial clustering. Testing this argument requires an accurate 
determination of the halo mass function. We present a Maximum Likelihood method 
for fitting parametric functional forms to halo abundances which differs from previous 
work because it does not require binned counts. Our conclusions do not depend on 
whether we use our method or more conventional ones. In addition, halo abundances 
depend on how halos are defined. Our conclusions do not depend on the choice of 
link length associated with the friends-of-friends halo-finder, nor do they change if we 
identify halos using a spherical overdensity algorithm instead. The large scale halo 
bias measured from the matter-halo cross spectrum, b_×, and from the halo autocorrelation 
function, b_ξ (on scales k ~ 0.03 h Mpc^-1 and r ~ 50 h^-1 Mpc), can differ by as much 
as 5% for halos that are significantly more massive than the characteristic mass M_*. 
At these large masses, the peak background split estimate of the linear bias factor b_1 
is 3-5% smaller than b_ξ, which is 5% smaller than b_×. We discuss the origin of these 
discrepancies: deterministic nonlinear local bias, with parameters determined by the 
peak-background split argument, is unable to account for the discrepancies we see. A 
simple linear but nonlocal bias model, motivated by peaks theory, may also be difficult 
to reconcile with our measurements. More work on such nonlocal bias models may be 
needed to understand the nature of halo bias at this level of precision. 

Key words: methods: analytical - galaxies: formation - galaxies: haloes - dark matter 
- large scale structure of the universe 



1 INTRODUCTION 

Halo abundances and clustering are both crucial ingredients 
in the halo model of large scale structure (Peacock & Smith 
2000; Seljak 2000; Scoccimarro et al. 2001; Cooray & Sheth 
2002). However, following Sheth & Tormen (1999), the two 
are not independent: an accurate model of halo cluster- 
ing is part and parcel of an accurate model of halo abun- 
dances. This is because of an argument that has come to be 
called the peak-background split (Bardeen et al. 1986; Cole 
& Kaiser 1989; Mo & White 1996), in which, on large scales, 
perturbed regions of the matter field are treated as though 
they are universes with slightly different mean density and 
Hubble constant (for an explicit calculation, see Martino & 
Sheth 2009). 

As a result, there has been considerable effort to provide 
simple, accurate and physically motivated functional forms 
for the halo mass function (Press & Schechter 1974; Bond 
et al. 1991; Lee & Shandarin 1998; Sheth et al. 2001), and 
to determine if such models provide adequate descriptions of 
the simulations. When appropriately scaled, the functional 
form predicted by Press & Schechter (1974) is independent 
of power spectrum and cosmology. Sheth & Tormen (1999) 
showed that, although this sort of rescaling of the mass func- 
tion is not expected to hold exactly for the CDM family of 
models, it does produce an approximately universal curve 
in simulations, although the functional form of this univer- 
sal curve is different from that of Press & Schechter (1974). 
Subsequent work has confirmed that the mass function is 
indeed approximately universal (Jenkins et al. 2001; Reed 
et al. 2003), with only the most recent measurements be- 
ginning to detect the expected departures from universality 
(White 2002; Reed et al. 2007; Tinker et al. 2008). This is 
simply because the departures are small so large simulation 
volumes are required to see the effect with high significance. 

The main goal of the present paper is to use the more 
precise measurements of halo abundances which can now be 
made (in simulations) to perform more precise tests of how 
well the peak background split argument works. We do so 
by measuring halo abundances and clustering in large vol- 
umes, and then comparing the clustering signal with that 
predicted from the measured abundances by the peak back- 
ground split ansatz. We assess the robustness of our re- 
sults by varying how we identify halos in the simulations; 
in each case, we use two different parametrizations for our 
measured abundances, and three different methods for fit- 
ting the parametrized models to the measurements. We 
then compare the predicted and measured clustering signals 
in both real and Fourier space, and we do all this for two 
(and sometimes three) different redshifts. 

At this level of precision, the comparison of measure- 
ment and prediction is somewhat subtle, because it depends 
on the details of whether or not the bias is expected to be 
deterministic or stochastic, local or nonlocal, linear or non- 
linear, constant or scale-dependent. We study two limiting 
cases in detail: a bias which is deterministic and local in 
configuration space, and is scale independent at linear order 
but contains higher order nonlinear terms, and a bias which 
is deterministic and linear in Fourier space, with no higher 
order terms, but the linear bias is k-dependent. The former 
arises naturally in the simplest models of halo abundances; 
the latter is motivated by associating nonlinear structures 
with peaks in the initial density fluctuation field. 

This paper is organized as follows: Section 2 gives some 
theoretical background and describes a number of ways one 
might have quantified the bias between the halo and mat- 
ter distributions. It then specifies the particular ways we 
have adopted for our test. Section 3 presents measurements 
of halo abundances and clustering in our simulations, and 
comparison with the bias predicted by the peak background 
split argument. A final section summarizes our results and 
conclusions. Appendix A describes a number of ways we 
have attempted to fit the halo mass function, one of which 
is a new Maximum Likelihood estimator of halo abundances 
that does not require binned counts. Appendix B provides 
explicit expressions for the peak background split bias fac- 
tors associated with our parametrizations of the halo mass 
function. 



2 BACKGROUND 

2.1 Counts in cells and the peak background split 

The peak background split (Bardeen et al. 1986; Cole & 
Kaiser 1989) is an approximation in which the effect of long 
wavelength density perturbations on structure formation is 
simply to modify the collapse times of non-linear objects. 
This modification depends on the density of the perturbed 
region but not on its volume. It is common to state that the 
number density of halos in a perturbed region is expected 
to be the same as that of an unperturbed region, but at 
a slightly different time. However, it is better to think of 
the perturbed number density as being the same as that 
of an unperturbed region in a different background cosmol- 
ogy (after all the density is different), but one that has the 
same age (meaning the effective Hubble constant is differ- 
ent) (Martino & Sheth 2009). When expressed in terms of 



linear theory quantities, this effect changes the critical den- 
sity for non-linear collapse in a way that depends on the 
nonlinear density of the perturbation (Mo & White 1996). 

Thus, while in general the mean number of halos of mass 
m in a cell depends on its volume V and mass M, in this 
approximation, for cells which are sufficiently large 
that m ≪ M, the overdensity of halos depends, not on M 
and V separately, but on M/V = ρ̄(1 + δ), where ρ̄ is the 
mean background density. That is to say, 

    \langle N_h(m,\delta_c|M,V)\rangle = n_h(m,\delta_c)\,V\,\left[1 + \langle\delta_h(m|\delta)\rangle\right]    (1)

where n_h(m, δ_c) is the average number density of halos with 
mass m, and 

    \langle\delta_h(m|\delta)\rangle = \sum_{k>0} \frac{b_k(m,\delta_c)}{k!}\,\left(\delta^k - \langle\delta^k\rangle\right).    (2)

The coefficients b_k(m, δ_c) come from Taylor expanding 
n_h(m, δ_c − δ) around δ = 0, and the ⟨δ^k⟩ terms are required 
if one wishes to truncate the expansion at finite k but still 
enforce that ⟨δ_h(m|δ)⟩ averages to zero over all cells. Thus, 
in this framework, halo bias is deterministic (δ is the only 
random field that determines δ_h) but nonlinear (high order 
terms in δ contribute), so it is 
of the form discussed by e.g. Fry & Gaztanaga (1993). 
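
The Taylor expansion described above can be illustrated with a short numerical sketch. It is not the calculation of Appendix B; it assumes the Sheth & Tormen form ν f(ν) ∝ [1 + (qν)^-p] (qν)^(1/2) exp(−qν/2) with ν = (δ_c/σ)^2, and uses the fact that, at fixed mass, the abundance depends on δ_c only through ν f(ν). The function names, the value δ_c = 1.686 and the sample (q, p) are illustrative assumptions. The first coefficient is then b_1 = −∂ ln n/∂δ_c (the Lagrangian value; the Eulerian value adds unity).

```python
import math

DELTA_C = 1.686  # assumed spherical-collapse threshold (Einstein-de Sitter value)


def log_nu_f(nu, q, p):
    """ln[nu f(nu)] for the assumed form, dropping the constant normalisation."""
    x = q * nu
    return math.log1p(x ** (-p)) + 0.5 * math.log(x) - 0.5 * x


def b1_numeric(sigma, q, p, delta_c=DELTA_C, eps=1e-5):
    """-d ln n / d delta_c at fixed sigma(m), estimated by a central difference."""
    def ln_n(dc):
        return log_nu_f((dc / sigma) ** 2, q, p)
    return -(ln_n(delta_c + eps) - ln_n(delta_c - eps)) / (2.0 * eps)


def b1_analytic(sigma, q, p, delta_c=DELTA_C):
    """Closed form of the same derivative for the assumed functional form."""
    x = q * (delta_c / sigma) ** 2
    return (x - 1.0) / delta_c + 2.0 * p / (delta_c * (1.0 + x ** p))


if __name__ == "__main__":
    # Illustrative parameter values only; they are not fits from this paper.
    for sigma in (0.5, 1.0, 2.0):
        print(sigma, b1_numeric(sigma, 0.75, 0.33), b1_analytic(sigma, 0.75, 0.33))
```

The two estimates should agree to within the finite-difference error, which is a quick consistency check on any closed-form bias expression derived from a fitted mass function.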

The most direct check of this assumption is to mea- 
sure the quantity on the left hand side of equation (2) in 
large cells V , and compare with the coefficients one predicts 
from the mass function (Sheth & Lemson 1999; Smith et al. 
2008). Note that this is explicitly a real-space, counts-in- 
cells calculation. It is, however, a difficult approach, since 
the halo bias coefficients of interest are those for large cells, 
but these tend to have small variance (the universe is homo- 
geneous on large scales), meaning that there is only a small 
range of δ over which to measure the shape of the halo bias 
relation. In practice, measuring b_2 is tough, and b_3 is even 
more challenging. 



2.2 Other measures of the linear bias factor 

A less direct measure of this bias is given by the volume 
average of the cross correlation function between halos and 
mass. In this case, one measures 

    1 + \sigma^2_{hm}(V) = \int dM\, p(M|V) \sum_{N_h} p(N_h|M,V)\, \frac{M}{\bar\rho V}\,\frac{N_h}{n_h V}
                         = \int dM\, p(M|V)\, \frac{M}{\bar\rho V}\,\frac{\langle N_h|M,V\rangle}{n_h V}
                         = 1 + \sum_{k>0} \frac{b_k}{k!}\,\langle\delta^{k+1}\rangle
                         = 1 + b_1\,\sigma^2_m + \cdots    (3)

where σ²_hm(V) is the cross-correlation between halo and mass 
counts in cells of size V, p(M|V) is the probability a ran- 
domly chosen cell of size V contains mass M, and 

    \sigma^2_m \equiv \langle\delta_m^2\rangle = \int \frac{dk}{k}\,\frac{k^3 P(k)}{2\pi^2}\,W^2(kR)    (4)

where P(k) is the power spectrum of the mass, and W is the 
Fourier transform of the smoothing volume (so V ∝ R^3). 
An even more indirect measure is the second factorial moment 
of the halo counts-in-cells: 



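    1 + \sigma^2_{hh}(V) = \int dM\, p(M|V) \sum_{N_h} p(N_h|M,V)\, \frac{N_h\,(N_h - 1)}{(n_h V)^2}
                         = \int dM\, p(M|V)\, \frac{\langle N_h (N_h - 1)|M,V\rangle}{(n_h V)^2}.    (5)

If the halo counts in cells (M, V) follow a Poisson distribu- 
tion around the mean ⟨N_h|M, V⟩ (this is a bad assumption 
when m is not small compared to M), then this becomes 

    1 + \sigma^2_{hh}(V) = \int dM\, p(M|V)\, \frac{\langle N_h|M,V\rangle^2}{(n_h V)^2} = 1 + b_1^2\,\sigma^2_m + \cdots    (6)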



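Finally, it is worth noting that 

    \sigma^2_{hm}(R) = \int \frac{dk}{k}\,\frac{k^3 P_{hm}(k)}{2\pi^2}\,W^2(kR)
                     = \int_0^{2R} \frac{dr}{R}\,\frac{3 r^2}{R^2}\,\xi_{hm}(r)\,\frac{(4 + r/R)\,(2 - r/R)^2}{16}    (7)

where the final expression assumes tophat smoothing. Sim- 
ilar relations hold for σ²_hh, ξ_hh and P_hh. 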
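
As a check on the tophat form of the real-space integral, the following sketch evaluates the weight that multiplies ξ(r) when averaging over a sphere of radius R, written here as (3r²/R³)(4 + r/R)(2 − r/R)²/16 on 0 ≤ r ≤ 2R. The function names and the simple midpoint rule are illustrative choices, not part of the paper. A constant ξ must average to itself, and ξ ∝ r^-2 should give 9/(4R²).

```python
def tophat_xi_kernel(r, radius):
    """Weight multiplying xi(r) dr in the tophat volume average (0 <= r <= 2R)."""
    s = r / radius
    return (3.0 * s * s / radius) * (4.0 + s) * (2.0 - s) ** 2 / 16.0


def volume_average(xi, radius, n=20000):
    """Midpoint-rule estimate of the integral of xi(r) * kernel over 0..2R."""
    h = 2.0 * radius / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += xi(r) * tophat_xi_kernel(r, radius)
    return total * h


if __name__ == "__main__":
    print(volume_average(lambda r: 1.0, 8.0))        # normalisation check
    print(volume_average(lambda r: r ** -2.0, 2.0))  # compare with 9 / (4 R^2)
```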

So, if b_1 is independent of scale, then the slope of the 
regression of δ_h on δ_m is the same quantity as σ²_hm/σ²_m 
and ξ_hm/ξ_dm, and if the counts are Poisson, then this is also 
the same as √(σ²_hh/σ²_m), √(ξ_hh/ξ_dm), σ²_hh/σ²_hm, and ξ_hh/ξ_hm 
at large scales. In addition, if b_1 is independent of scale, 
then the bias in Fourier space quantities is simply related 
to (equal to!) those in configuration space. In particular, 
√(P_hh(k)/P(k)), P_hh(k)/P_hm(k) and P_hm(k)/P(k) should all 
equal b_1 at low k. But in general, all these quantities are 
different. We discuss some of the differences expected in 
concrete bias models and in view of our measurements be- 
low. 

Even if these bias factors are equal, actually estimating 
P_hh is difficult because the measurement requires a shot- 
noise correction for the discreteness of the halos. Because 
the massive halos of most interest in the present study are 
rare, this correction can be significant, but because they 
are strongly clustered, this correction is currently uncertain 
(Smith et al. 2008). There is no shot-noise correction for 
P_hm, so, in what follows, this is the statistic we will use to 
test the peak background split expression for the linear bias 
parameter b_1. We also test the ratio √(ξ_hh/ξ_dm), for which 
no shot-noise correction is necessary. 

2.3 The effects of nonlinearity on large-scale bias 

Differences between the predicted b_1 and the large scale bias 
measured from correlation functions are expected if the bias 
is nonlinear. Indeed, the peak-background split itself pre- 
dicts that halo bias is not linear (the higher order coefficients 
in equation 2 are generically non-zero), and such nonlineari- 
ties are seen in numerical simulations (see, e.g., scatter plots 
of δ_h vs δ_m in Appendix B of Smith et al. 2007). This com- 
plicates interpretation of the measured values of P_hm/P_mm 
and √(ξ_hh/ξ_dm) as follows. 

In the local bias framework of equation (2), the halo- 
mass cross-correlation reads 

    \langle\delta_{h1}\delta_2\rangle = b_1\,\langle\delta_1\delta_2\rangle + \frac{b_2}{2}\,\langle\delta_1^2\delta_2\rangle + \frac{b_3}{6}\,\langle\delta_1^3\delta_2\rangle + \cdots    (8)

where 1 and 2 denote two different spatial positions. In the 



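large-scale limit, perturbation theory says that 

    \langle\delta_1^p\,\delta_2^q\rangle_c = C_{pq}\,\sigma_R^{2(p+q-2)}\,\langle\delta_1\delta_2\rangle    (9)

(the subscript c denoting the connected part of the moment), 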
where σ²_R denotes the variance in the dark matter field when 
smoothed on scale R, and the C_pq are closely related to the 
skewness, kurtosis and so on. E.g., C_21 = 68/21 + γ_R/3, 
with γ_R ≡ d ln σ²_R/d ln R and C_pq = C_p1 C_q1 (Bernardeau 
1996; Gaztanaga et al. 2002). Thus, on large scales, the 
cross-correlation bias is 

    \frac{\langle\delta_{h1}\delta_2\rangle}{\langle\delta_1\delta_2\rangle} = b_1 + \frac{\sigma_R^2}{2}\,(C_{21}\,b_2 + b_3) + \frac{\sigma_R^4}{6}\,C_{31}\,b_3 + \cdots    (10)


and it applies equally well in configuration and Fourier 
space. Keeping only the first order corrections to linear 
bias yields 

    \frac{P_{hm}(k|R)}{P(k)} = b_1 + \sigma_R^2\left[\left(\frac{34}{21} + \frac{\gamma_R}{6}\right) b_2 + \frac{b_3}{2}\right]    (11)

for the Fourier-space quantity (e.g. Smith et al. 2007, who 
neglected the γ_R term), where P_hm(k|R) denotes the cross- 
power of the halo and mass fields when both have been 
smoothed with a filter of scale R. 

In the present context, for halos of a given mass, the 
peak-background split argument gives the values of b_i. How- 
ever, the choice of smoothing scale R is less straightforward. 
It must be large enough that the assumptions of a determin- 
istic, scale independent bias are reasonably accurate, so R 
must be substantially larger than the Lagrangian radius of 
the halos (Sheth & Lemson 1999; Smith et al. 2008; Man- 
era & Gaztanaga 2009). But there is no other underlying 
theory for this scale. 

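The same logic that led to equation (10) says that 

    b_\xi^2 \equiv \frac{\langle\delta_{h1}\delta_{h2}\rangle}{\langle\delta_1\delta_2\rangle}
            = b_1^2 + b_1\,\sigma_R^2\,(C_{21}\,b_2 + b_3) + \frac{b_2^2}{2}\,\xi + \cdots
            = b_\times^2 + \frac{b_2^2}{2}\,\xi - \sigma_R^4\,\frac{b_3}{4}\,(b_3 + 2\,b_2\,C_{12}) + \cdots    (12)

The final expression shows that b_ξ ≠ b_× even when ξ ≪ 1. 
And the ξ term in b_ξ generates a shot-noise contribution at 
low-k in the power spectrum. 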



2.4 The peaks-bias model 

The previous discussion supposed that the fundamental 
quantity was the bias between halo and mass counts in cells. 
An alternative model is that (high) peaks in the initial den- 
sity field are the seeds around which massive halos form 
(Kaiser 1984). In this case the large scale bias is simplest 
in Fourier space: 



    \delta_{pk}(k) = (b_\nu + b_\zeta k^2)\, W_{pk}(kR_{pk})\, \delta(k),    (13)



where Wpk is the smoothing filter with which the peak was 
identified (Matsubara 1999; Desjacques 2008). 

Typically, to approximate halos of mass m by peaks, 
one uses a Gaussian smoothing filter with m ∝ R_pk^3. In this 
case, a halo of mass m is associated with a peak of height 
ν = δ_pk/σ_0, where δ_pk is of order unity as suggested by the 
spherical evolution model, and σ_0 is given by equation (4) 
bias symbol      meaning                                          equation

b_1, b_2, b_3    First (linear), second and third order bias      (2)
                 from Taylor expansion of the fluctuation in
                 the mass density field. This is a deterministic
                 local bias model for which predictions exist
                 from the peak background split argument in
                 the large cell limit.

b_×              Large scale bias from the matter-halo cross      (11)
                 power. Values taken at k = 0.03 h Mpc^-1.

b_ξ              Large scale bias from the correlation            (12)
                 function. Values are taken by averaging ξ
                 over 40 < r < 60 h^-1 Mpc.

b_ν, b_ζ         Linear and quadratic bias from the high          (13)
                 peaks model.

Table 1. Notation for the various bias factors used in this paper. 



but with smoothing scale R_pk. At high masses, the result- 
ing peak mass function is similar to that of halos (Sheth 
2001). The quantity b_ζ ∝ (σ_0/σ_1)^2 (ν/σ_0 − b_ν), where σ_1 
is similar to σ_0, but with an extra factor of k^2 in the inte- 
gral in equation (4). For a power law power spectrum with 
P(k) ∝ k^n, (σ_0/σ_1)^2 ∝ m^(2/3). In the high peak (ν ≫ 1) limit, 
b_ν → (ν − 3/ν)/σ_0 so ν/σ_0 − b_ν → 3/(σ_0 ν). In this limit, b_ζ 
increases as m increases, and (b_ζ/b_ν) ∝ (σ_0/σ_1)^2/ν^2 ∝ 
m^(2/3 − (n+3)/3), a point to which we will return later. 
Equation (13) implies that 
Equation (13) implies that 

    P_{pk,\delta}(k) = (b_\nu + b_\zeta k^2)\, W_{pk}(kR_{pk})\, P_L(k),    (14)
    P_{pk,pk}(k) = (b_\nu + b_\zeta k^2)^2\, W^2_{pk}(kR_{pk})\, P_L(k),    (15)

so P_pk,δ(k)/P(k), √(P_pk,pk(k)/P(k)) and 
P_pk,pk(k)/P_pk,δ(k) 
all measure the same quantity (even though the quantity 
depends on k!), but the bias relations from correlation func- 
tions or counts in cells will be more complicated (because 
of the k dependence). In particular, notice that, in contrast 
to the previous model, here the linear bias factor itself is 
scale-dependent. 
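
A minimal sketch of this statement: in the peaks model the three Fourier-space ratios reduce to the same k-dependent factor. The Gaussian window exp(−k²R²/2) is the usual convention and is an assumption here, as are the function names and the numerical values, which are purely illustrative.

```python
import math


def peak_bias(k, b_nu, b_zeta, r_pk):
    """(b_nu + b_zeta k^2) W_pk(k R_pk), assuming a Gaussian filter."""
    return (b_nu + b_zeta * k * k) * math.exp(-0.5 * (k * r_pk) ** 2)


def peak_spectra(k, p_lin, b_nu, b_zeta, r_pk):
    """Return (P_pk,delta, P_pk,pk) for a linear matter power p_lin at wavenumber k."""
    b = peak_bias(k, b_nu, b_zeta, r_pk)
    return b * p_lin, b * b * p_lin


if __name__ == "__main__":
    p_lin = 2.0e4  # hypothetical value of P_L(k)
    for k in (0.01, 0.05, 0.1):
        p_cross, p_auto = peak_spectra(k, p_lin, b_nu=2.0, b_zeta=40.0, r_pk=3.0)
        print(k, p_cross / p_lin, math.sqrt(p_auto / p_lin), p_auto / p_cross)
```

All three printed ratios coincide at each k, while the common value changes with k because of the b_ζ k² term and the filter.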

Now, the bias relations above are for peaks identified in 
the initial fluctuation field. At this time b_1 from the peak 
background split calculation equals b_ν from the Fourier bias 
calculation (Desjacques & Sheth 2009). (In principle, at 
least for peaks, this agreement can be used as a guide to the 
appropriate shot-noise correction for P_pk,pk(k) - like mas- 
sive halos, high peaks are rare, so the shot-noise correction 
matters - but this is beyond the scope of this paper.) A 
peak background split estimate for the late time bias pa- 
rameters b_1, b_2, etc. of peaks was made by Mo et al. (1997). 
This estimate says that b_1 → 1 + b_1 (with similar conse- 
quences for b_2 etc.), and is in reasonable agreement with 
measurements in simulations of √(σ²_pk,pk/σ²) and √(ξ_pk,pk/ξ) 
(Mo et al. 1997) (i.e., within the accuracy of what was pos- 
sible with the smaller simulation volumes of 10 years ago). 
This suggests that b_ν evolves as b_1, but a good model for 
the evolution of b_ζ is still not available. Therefore, when we 
compare the peaks model with measurements in simulations, 
we will simply consider if a k^2 scaling of the bias factor seems 
Figure 1. Mass function at z = 0 (upper set of curves) and 
z = 0.5 (lower set of curves) for three linking lengths in simula- 
tions: 0.15 (fewest massive halos), 0.168 and 0.2 (most massive 
halos). Lines show equation (19) with parameters from our new 
Maximum Likelihood estimator (see Table 2). 



appropriate, and if the onset of this term occurs at smaller 
k for halos of higher masses. 



3 MEASUREMENTS IN SIMULATIONS 
3.1 Description of the simulations 

For our analysis we use 49 cosmological dark matter simula- 
tions of a flat ΛCDM cosmology with Ω_m = 0.27, Ω_Λ = 0.73, 
Ω_b = 0.046, σ_8 = 0.9, h = 0.72 and n_s = 1.0. Each simu- 
lation was run using periodic boundary conditions in a box 
of size L_box = 1280 h^-1 Mpc, which contains 640^3 particles. 
This gives a particle mass of M_p ≈ 6 × 10^11 h^-1 M_⊙. All 49 
runs have the same parameters except for the random seeds 
used to generate the initial conditions. Therefore they can 
be considered as different realizations (or parts) of the same 
universe; this allows us to estimate errors on the mass func- 
tion and bias factors we measure in the next section. For ref- 
erence, the total volume sampled by our runs is V_t ≈ 102 h^-3 
Gpc^3. 
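
The quoted particle mass and total volume follow from the box size, particle number and matter density. The short sketch below reproduces that arithmetic; the critical density value 2.775 × 10^11 h^2 M_⊙ Mpc^-3 is the standard one and is an input assumed here rather than taken from the text, and the function and variable names are illustrative.

```python
RHO_CRIT = 2.775e11  # critical density today, h^2 M_sun / Mpc^3 (standard value)


def particle_mass(omega_m, box_size, n_side):
    """Mass per particle in h^-1 M_sun for n_side^3 particles in a box_size^3 box."""
    return omega_m * RHO_CRIT * (box_size / n_side) ** 3


m_p = particle_mass(0.27, 1280.0, 640)       # about 6e11 h^-1 M_sun
total_volume = 49 * (1280.0 / 1000.0) ** 3   # in h^-3 Gpc^3, about 102.8
min_halo_mass = 105 * m_p                    # 105-particle threshold, about 6.3e13

if __name__ == "__main__":
    print(f"{m_p:.3e} {total_volume:.1f} {min_halo_mass:.2e}")
```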

One potentially important difference from almost all 
previous work in which volumes of this size have been stud- 
ied is in how we generate our initial conditions. These are 
set at z = 50 by using CMBFAST (Seljak & Zaldarriaga 
1996) to generate the Transfer function for the initial matter 
power spectrum. We then use a Second Order Lagrangian 
Perturbation Theory (2LPT) code (Scoccimarro 1998) to 
generate the initial displacement field. The use of 2LPT 
initial conditions ensures that spurious transient effects in 
the simulations are negligible at low redshifts (Crocce et al. 
2006). The tree-PM code Gadget-2 (Springel 2005), with 
a softening length set to 20 h^-1 kpc, is then used to simulate 
the subsequent evolution. 
Figure 2. Same as figure 1, only now, to better see the range 
on the plot, the mass functions have been divided by a fiducial 
function (equation 19 with p = 0.33 and q = 0.75). Error bars 
show the rms variation between simulations. 

3.2 The halo mass function 

We have run a standard friends-of-friends (FoF) code to 
identify dark matter halos in the simulations at redshifts 
z = 0 and z = 0.5. The halo mass function one obtains 
depends on the one free parameter of the FoF algorithm: 
the linking length. Shorter linking lengths return lower mass 
halos. Since halo abundances and clustering strength are 
intimately related, the choice of linking length also affects 
the halo bias parameters. To address this, we have explored 
three choices: l_link = 0.15, 0.168 and 0.2 (in units of the in- 
terparticle separation). 

The halo mass of each object found by the FoF algo- 
rithm was determined from the number of particles N it 
contains, corrected for discreteness effects following Warren 
et al. (2006). Thus, M_h = M_p N_corrected, where N_corrected = 
N(1 − N^-0.6). This correction has been tested only for FoF 
halos with l_link = 0.2, and may slightly overcorrect the mass 
for smaller linking lengths. Since in this paper we are fitting 
the mass function for halos having more than 105 particles, 
these differences are negligible for the large mass halos which 
are of most interest in what follows. 
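
To give a sense of the size of the discreteness correction, the sketch below applies the Warren et al. (2006) form N(1 − N^-0.6) at the 105-particle threshold and at larger N. The function names are illustrative, and the default particle mass is the rounded value 6 × 10^11 h^-1 M_⊙.

```python
def corrected_particle_number(n):
    """FoF particle count corrected for discreteness: N (1 - N^-0.6)."""
    return n * (1.0 - n ** -0.6)


def halo_mass(n, particle_mass=6.0e11):
    """Halo mass in h^-1 M_sun for a group of n particles (rounded particle mass)."""
    return particle_mass * corrected_particle_number(n)


if __name__ == "__main__":
    for n in (105, 1000, 10000):
        frac = 1.0 - corrected_particle_number(n) / n
        print(n, corrected_particle_number(n), f"{100 * frac:.2f}% lower")
```

The fractional correction is about 6 per cent at 105 particles and shrinks steadily with N, which is why it matters little for the massive halos of interest.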

It is common to use the same linking length for all red- 
shifts. However, the natural outcome of the spherical col- 
lapse model predicts that, in ΛCDM models, halos are a 
larger multiple of the background density at late times. If 
this model is correct, then one expects the appropriate link 
length to be approximately constant at early times, and to 
decrease at late times. Our choices of linking-length approx- 
imately bracket the expected range of densities. 

Another popular choice for identifying halos is to re- 
quire them to be a fixed multiple of the critical density. In 
ΛCDM models, this has the virtue of being well-motivated 
at early times (when the background cosmology is effectively 
Einstein-de Sitter, so the background and critical densities 
are equal) as well as at very late times (when the critical 
density has become constant). In section 3.8 we use halos 
identified using a spherical overdensity method by Tinker 
et al. (2008). However, in this case, the overdensity was a 



Figure 3. Same as Figure 1, but now shown in scaled units, so 
outputs from z = 0, 0.5 and 1 are shown together. Because we 
only count halos with more than 105 particles, the lower redshift 
output probes to smaller ν, and the higher redshift output to 
higher ν. Results for the three linking lengths are shown: 0.15, 
0.168 and 0.2. For a fixed ν larger l_link yields more halos. 

[Figure 4; horizontal axis: ln(ν)]

Figure 4. Same as Figure 2, but now in scaled units. Error bars 
show the error on the mean value between simulations. 

fixed multiple (200) of the background density. We find that 
the main results which follow are robust to which halo finder 
we use. 

Figure 1 shows the mass functions associated with the 
three linking lengths at z = 0 and z = 0.5. To emphasize 
detailed differences, we show this same information divided 
by a fiducial model for halo abundances in Figure 2. The 
fiducial model is that of equation (19) below, with p = 0.33 
and q = 0.75. In these, as in all the plots to follow, the bins 
are 0.05 dex in mass, and error bars, unless stated otherwise, 
show the rms variation between simulations. The true error 
on the mean is a factor of √49 = 7 smaller. It is interesting 
to ask if the halo catalog returned by a shorter link-length 
is essentially a higher redshift version of the halo catalog 
associated with the longer link-length. We will have more 
to say about this shortly, but note that this dependence on 
linking length is not naturally included in models of halo 
abundances (e.g. Sheth et al. 2001). 

When the masses are suitably rescaled, the mass func- 
tion can be expressed in a functional form that is nearly 
universal - being approximately independent of time, cos- 
mology, and initial power spectrum (Sheth & Tormen 1999). 
The spherical evolution model suggests that the natural scal- 
ing variable should be 



    \nu \equiv \frac{\delta_{sc}^2(z)}{D^2(z)\,\sigma^2(m)}    (16)

where δ_sc is the critical density required for spherical collapse 
in a cosmology with parameters (Ω_z, Λ_z), D(z) is the linear 
theory growth factor in units of its value at z = 0 [e.g. 
D(z) = (1 + z)^-1 and δ_sc(z) = 1.686 if (Ω_z, Λ_z) = (1, 0)], 
and 

    \sigma^2(m) = \int \frac{dk}{k}\,\frac{k^3 P_0(k)}{2\pi^2}\,W^2(kR_m)    (17)

with m = ρ̄(4πR_m^3/3) and W(x) = (3/x^3)(sin x − x cos x). 
Here P_0(k) denotes the initial power spectrum of fluctua- 
tions, scaled using linear theory to z = 0, and ρ̄ is the co- 
moving background density. 
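
The variance integral can be sketched numerically for a toy case. The example below uses the tophat window W(x) = (3/x³)(sin x − x cos x) and a pure power-law spectrum P(k) = amp k^n, which is an illustrative stand-in for the CMBFAST spectrum used in the simulations; the integration limits, step count and function names are assumptions. For a power law, σ²(R) scales as R^-(n+3), which the code makes explicit.

```python
import math


def tophat_window(x):
    """Fourier transform of a spherical tophat; series used near x = 0."""
    if abs(x) < 1e-3:
        return 1.0 - x * x / 10.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3


def sigma2_power_law(radius, n, amp=1.0, x_min=1e-4, x_max=200.0, steps=100000):
    """sigma^2(R) for P(k) = amp * k**n, via the trapezoid rule in ln(kR)."""
    dlnx = math.log(x_max / x_min) / steps
    total = 0.0
    for i in range(steps + 1):
        x = x_min * math.exp(i * dlnx)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x ** (n + 3) * tophat_window(x) ** 2
    return amp * radius ** (-(n + 3)) * total * dlnx / (2.0 * math.pi ** 2)


def scaled_nu(delta_sc, growth, sigma2):
    """nu = delta_sc^2 / (D^2 sigma^2), the squared-height convention assumed here."""
    return delta_sc ** 2 / (growth ** 2 * sigma2)


if __name__ == "__main__":
    s1 = sigma2_power_law(1.0, -1.0)
    s2 = sigma2_power_law(2.0, -1.0)
    print(s1 / s2)  # ratio expected to be 2**(n + 3) = 4 for n = -1
```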

So, one measure of the best link-length is to see which 
one provides the most universal scaling. Figure 3 shows the 
mass functions in these scaled units, ν, and Figure 4 shows 
these curves divided by the same fiducial model as before. 
Because we only have a fixed mass range in the simulations, 
the higher redshift outputs mainly probe the ν ≫ 1 end of 
the mass function. Therefore, in these figures, we also show 
results for z = 1. 

It is not obvious that any one link length produces more 
self-similar scalings than the others. What is more appar- 
ent is that, whatever the link-length, the z = 0 abundances 
appear to be offset to slightly larger values compared to 
those at higher z. This is in qualitative agreement with 
the spherical model, which predicts that halos should be in- 
creasingly dense relative to the background at late times, 
meaning that the appropriate link length should be smaller 
at late times. By using a fixed link length, we will overesti- 
mate halo masses, and hence the abundance at large ν. 

A slight variation on the appropriate self-similar scaling 
is to ignore the z dependence of δ_sc. Although this has no 
physical motivation, it is a popular choice (e.g. Jenkins et al. 
2001; Reed et al. 2003; Warren et al. 2006). We have found 
that this makes the mass function slightly less universal (the 
offset at z = 0 is slightly more pronounced), but since we 
are not scaling the link-lengths with time in the way the 
spherical model suggests, we do not think our measurements 
advocate strongly for including the z-dependence of δ_sc. 



3.3 Fitting the mass function 

We fit the halo catalog to a given parametric model of the 
halo mass function in three ways, and we do this for the 
functional forms given by Sheth & Tormen (1999) and War- 
ren et al. (2006). In both cases 



    \nu f(\nu) \equiv \frac{m}{\bar\rho}\,\frac{dn(m)}{d\ln m}\,\frac{d\ln m}{d\ln\nu}.    (18)
[Figure 5; horizontal axis: Log[M]]



Figure 5. Ratio of variance of halo counts between runs to mean 
halo count for a number of bins in mass. For each mass bin, error 
bars show the error on the mean between the six measurements of 
this ratio (the three link lengths at each of two redshift bins). If 
the counts were Poisson, this ratio would be unity, with a typical 
spread of about 0.2 (see text in section 3.6). 



The first case has 

    \nu f(\nu) = A_p\left[1 + (q\nu)^{-p}\right]\left(\frac{q\nu}{2\pi}\right)^{1/2}\exp(-q\nu/2)    (19)

where A_p = [1 + 2^-p Γ(1/2 − p)/Γ(1/2)]^-1 is chosen so that 
the integral of f over all ν is unity. This functional form has 
two free parameters, (q, p). The second, 

    \nu f_W(\nu) = A\left[1 + b\,(c\nu)^{-a}\right]\exp(-c\nu/2),    (20)

has four free parameters, because there is no requirement 
that the integral over all ν equal unity (indeed, it diverges!). 
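
The normalization of the first functional form can be verified numerically. The sketch below writes equation (19) as ν f(ν) = A_p [1 + (qν)^-p] (qν/2π)^(1/2) exp(−qν/2) with A_p = [1 + 2^-p Γ(1/2 − p)/Γ(1/2)]^-1, and integrates f over ν with a trapezoid rule in ln ν. The integration limits, step count and function names are illustrative assumptions, and the normalization requires p < 1/2.

```python
import math


def st_normalization(p):
    """A_p = [1 + 2^-p Gamma(1/2 - p) / Gamma(1/2)]^-1, valid for p < 1/2."""
    return 1.0 / (1.0 + 2.0 ** (-p) * math.gamma(0.5 - p) / math.gamma(0.5))


def nu_f_st(nu, q, p):
    """nu f(nu) for the two-parameter form of equation (19)."""
    x = q * nu
    return (st_normalization(p) * (1.0 + x ** (-p))
            * math.sqrt(x / (2.0 * math.pi)) * math.exp(-0.5 * x))


def integral_of_f(q, p, ln_min=-120.0, ln_max=7.0, steps=100000):
    """Trapezoid estimate of int f(nu) dnu = int nu f(nu) dln(nu)."""
    h = (ln_max - ln_min) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * nu_f_st(math.exp(ln_min + i * h), q, p)
    return total * h


if __name__ == "__main__":
    print(st_normalization(0.3))       # close to 0.322
    print(integral_of_f(0.75, 0.33))   # close to 1, independent of q
```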

Of our three fitting methods two are standard and one 
is new. The two standard methods compare the theoretical 
model with a binned halo mass function, and both assume 
Poisson counts in a bin. But, whereas one approach com- 
putes a simple chi-square of the difference between the ex- 
pected and measured counts in bins (e.g. Jenkins et al. 2001; 
Reed et al. 2007), the other uses a Maximum Likelihood ap- 
proach (Warren et al. 2006). These methods are slightly less 
than ideal, because there is some art in choosing the size 
of the bin. In the Appendix, we describe our new method, 
which is a Maximum Likelihood estimator that does not 
work with binned counts. 

Since the Poisson assumption is an important ingredient 
in the first two methods (our new method makes an equiva- 
lent assumption), it is important to check if this assumption 
is accurate. Figure 5 shows the ratio of the variance between 
runs to the mean count (determined by averaging over all 
the runs) in each bin. If the counts are truly Poisson, then 
this ratio should be unity, with a typical spread of about 
√(2/(N − 1)), where N is the number of runs from which the 
mean and variance were estimated (this assumes N ≫ 1). 
The Figure shows that the Poisson assumption is 
good, although there is a hint that the variance drops below 
the Poisson value for the most massive halos. 
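
The Poisson check described above amounts to two small computations per mass bin, sketched here. The function names and the sample counts are hypothetical; only the ratio and the spread formula √(2/(N − 1)) come from the text.

```python
import math
from statistics import mean, variance


def variance_to_mean(counts):
    """Sample variance of the counts across runs divided by their mean."""
    return variance(counts) / mean(counts)


def expected_spread(n_runs):
    """Typical scatter of the ratio about unity for Poisson counts."""
    return math.sqrt(2.0 / (n_runs - 1))


if __name__ == "__main__":
    hypothetical_counts = [10, 12, 8, 11, 9]  # one mass bin, five made-up runs
    print(variance_to_mean(hypothetical_counts))
    print(expected_spread(49))  # about 0.2 for the 49 runs used here
```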

To minimize systematic effects due to the finite mass 



Large scale bias 7 



Method:          New ML method     Poisson ML method   χ² method
z      l_link    q       p         q       p           q       p

0.0    0.15      0.82    0.289     0.805   0.297       0.803   0.298
0.0    0.168     0.773   0.272     0.756   0.282       0.753   0.284
0.0    0.2       0.709   0.248     0.689   0.26        0.687   0.261

Method:          New ML method     Poisson ML method   χ² method
                 rms(q)  rms(p)    rms(q)  rms(p)      rms(q)  rms(p)



0.008 
0.008 
0.007 


0.004 
0.004 
0.003 
0.003 
0.003 
0.5 
0.5 
0.724 
0.836 
0.784 
0.714 


0.293 
0.276 
0.251 


0.833 
0.785 
0.708 


0.296 
0.275 
0.257 


0.01 
0.009 
0.008 


0.006 
0.006 
0.007 
0.006 
0.006 


0.004 
0.004 
0.004 


0.007 
0.006 
0.006 


0.004 
0.004 



Table 2. Best fit parameters from three ways of fitting equation (19) to the halo abundances in the simulations, and the rms dispersion 
between the 49 simulations. 



[Figure 6; panels: Sheth and Tormen fit, Warren fit; z = 0.5; horizontal axis: log(m)]

Figure 6. Mass functions when the link length is 0.2, divided 
by a fiducial curve; three curves show fits to equations (19) and (20) 
returned by our three algorithms: χ² fit (green), Poisson ML 
fit (red), and new ML fit (blue). Error bars show the rms 
variation between simulations. 



resolution of the simulation we only fit the mass func- 
tion for halos with more than 105 particles: i.e., M ≳ 
6.3 × 10^13 h^-1 M_⊙. For the two fitting methods that require 
binned counts, the bin widths were 0.05 dex, except for the 
highest mass bin, which was enlarged to include at least 80 
halos (in most cases this last bin contains more than 200 
halos). For each bin, the rms of the 49 simulations was used 
as a weight when performing the chi-square fit. Figure 6 
shows the results; all three estimators return similar fits to 
the measurements. 

In practice, when fitting to equation (19), the best-fit p 
and q values vary little from one simulation to another, so if 
one averages p and q over the 49 runs, then the mass function 
associated with these averaged values is a good description 
of the average measured mass function. Table 2 shows the 
mean and rms dispersion of p and q, derived from averaging 
the best fit values for each of the 49 simulations. 

The uncertainties in p and q are correlated. (We argue in 
the Appendix that this may be understood, at least for our 
new estimator, in terms of the mass fraction that is predicted 
to lie above our minimum mass threshold, following Sheth 
et al. 2003. This quantity is very well measured in each 
simulation and, for the case of equation (19), this means 
that the best fit p and q are expected to lie along a simple 
well-defined curve, and they do.) 

Reporting our results of fitting to equation (20) is less 
straightforward. This is because this functional form has 
four free parameters, so two other measured quantities are 
required for tracking correlation between parameters. The 
most natural candidates are the mean and mean square mass 
of the halos that are above threshold. These constraints 
give rise to a complicated set of islands in parameter space, 
thus compromising any attempt to describe the uncertainty 
range on the best fit parameters in terms of simple lower 
and upper limits. (I.e., if one rises slightly above the level 
of the global minimum, one includes many other local min- 
ima.) In this case, the curves we show are for the parameters 
obtained by combining the halo catalogs from all the indi- 
vidual simulations, and then performing the fit. Figure A2 
illustrates. Notice that the parameter c is rather well con- 
strained, whereas the other two are not. This is because 
we are essentially only fitting the high mass end, where the 
counts are falling exponentially and the parameters a and b 
matter little. Indeed, whereas the various best-fit parameter 
combinations all produce essentially the same counts at the 
lowest masses we probe, they differ (slightly) only at high 
masses. 

Before concluding this section, it is worth noting that, 
for a given link-length, the value of p changes little with 
z. In contrast, for a fixed z, the value of p decreases systematically as $l_{\rm link}$
increases, suggesting that the intuitively appealing notion that the set of particles
linked together by longer link-lengths at an earlier time is the same as the set linked
together by a shorter link-length at a later time is not correct in detail.



8 M. Manera, R. K. Sheth, R. Scoccimarro 





Figure 7. Halo-mass bias from cross power spectra. Left panels show results at z = 0; right panels at z = 0.5. From top to bottom, 
linking lengths are 0.15, 0.168 and 0.2. Error bars show rms variation between simulations. Black solid lines are fits to the k dependence 
of bias between k = [0.006, 0.2] for the highest mass bins and k = [0.006, 0.3] for the other mass bins. 
Table 3. Peak-background split bias factors (Appendix B gives explicit expressions) with the free parameters p and q obtained from 
using our new ML method to fit the halo abundances to equation (19) (see Table 2). 
Table 4. Large-scale bias for three bins in halo mass. Halo masses are in units of
$10^{13}\,h^{-1}M_\odot$. The bias was measured from the halo-mass cross spectrum at
$k = 0.03\,h\,{\rm Mpc}^{-1}$, while the parameters $b_\nu$ and $b_\zeta$ are a fit to the
scale dependence of the bias between k = [0.006, 0.2] for the high mass bin and
k = [0.006, 0.3] for the other two mass bins.



3.4 Halo-mass cross power-spectra 

For the reasons discussed earlier, we have measured the halo- 
mass cross power spectra for all our halo catalogs, and so 
obtained the large scale bias for different halo mass bins. 

Figure 7 shows the ratio of $P_{\rm hm}$ to the power spectrum of the mass at z = 0 for
three bins in halo mass. The three panels show results for the three linking lengths. In
all cases, for k below $0.05\,h\,{\rm Mpc}^{-1}$, the bias is approximately independent of
k. (The strong k-dependence at larger k is consistent with previous work, e.g., Sheth &
Tormen 1999.) This large scale bias is largest for the halo catalog from the shortest
linking length. This is not surprising, since the bias is expected to increase with halo
mass, and a halo of a given mass with this length will only be more massive when the link
length is longer. Thus, for example, halos at the high end of the middle mass bin may have
been in the larger mass bin when the link length was longer. Their stronger clustering
increases the bias for the small link-length catalogs.

If we had found that the longer link-length halo catalogs from an earlier time were
essentially the same as the shorter link-length catalogs at a later time, then we would be
able to use the continuity equation to relate the bias of the high-z long-$l_{\rm link}$
objects to the bias of the low-z short-$l_{\rm link}$ objects. Although not exact, this
should still give a good qualitative idea of the bias: $(b_z - 1) = (b_0 - 1)(D_0/D_z)$ so,
for $b_0 > 1$, we expect the high-z sample to have a larger bias factor.
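As a worked illustration of this scaling, the sketch below evaluates the continuity-equation relation for a given ratio of growth factors. The function names are hypothetical, and the Einstein-de Sitter growth factor is used only as a convenient stand-in for the growth factor of the simulated cosmology.

```python
def bias_at_earlier_time(b0, growth_ratio):
    """Continuity-equation estimate: (b_z - 1) = (b_0 - 1) * (D_0 / D_z).

    b0           : bias of the objects at the later time
    growth_ratio : D_0 / D_z, ratio of linear growth factors (>= 1 for z > 0)
    """
    return 1.0 + (b0 - 1.0) * growth_ratio


def eds_growth_ratio(z):
    # Assumption for illustration only: Einstein-de Sitter growth, D = 1/(1+z).
    return 1.0 + z


# A b_0 = 2 population traced back to z = 0.5 under the assumption above.
b_high_z = bias_at_earlier_time(2.0, eds_growth_ratio(0.5))
```

Because $D_0/D_z > 1$, any population with $b_0 > 1$ maps to a larger bias at the earlier time, while $b_0 = 1$ is left unchanged.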



3.5 Relation to peaks bias 

In view of our discussion of peaks bias, we have fitted our measurements to functions of
the form $b_\nu + b_\zeta k^2$. These parameters are reported in Table 4 together with the
value of the bias at $k = 0.03\,h\,{\rm Mpc}^{-1}$ and its rms error. In most cases, the
quadratic form is not a good fit to the k-dependent bias at $k > 0.2\,h\,{\rm Mpc}^{-1}$:
the k-dependence is weaker. However, Table 4 shows that the amplitude of the quadratic
piece increases rapidly as m increases, in qualitative agreement with expectations.
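A fit of this form is linear in the two parameters, so it reduces to weighted least squares in the variable $k^2$. The following is a minimal sketch under that assumption; the function name and the use of the rms as an inverse-variance weight are choices made here, not a description of the fitting code used for Table 4.

```python
def fit_quadratic_bias(k, bias, sigma=None):
    """Weighted least-squares fit of bias(k) = b_nu + b_zeta * k**2.

    k, bias : sequences of wavenumbers and measured bias values
    sigma   : optional per-point errors (defaults to equal weights)
    Returns (b_nu, b_zeta).
    """
    if sigma is None:
        sigma = [1.0] * len(k)
    w = [1.0 / s ** 2 for s in sigma]
    x = [ki ** 2 for ki in k]          # the model is linear in x = k^2
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, bias))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, bias))
    det = sw * sxx - sx * sx
    b_zeta = (sw * sxy - sx * sy) / det
    b_nu = (sy - b_zeta * sx) / sw
    return b_nu, b_zeta
```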

We have found that the radii $R_{\rm pk}$ required to match the values of $b_\nu$ and
$b_\zeta$ in the large scale $\nu$ limit (equation 38 in Desjacques 2008) are about
$8-9\,h^{-1}$Mpc for the largest mass bin, and smaller for the other bins. These radii are
comparable to the initial Lagrangian radii of the halos, so they are not unreasonable.
However, to see if the scaling with mass is quantitatively correct, we should account more
carefully for how the range in halo masses maps to that in peak smoothing scales, as well
as for the effects of nonlinear evolution on $b_\nu$ and $b_\zeta$. This is beyond the
scope of our paper.



3.6 Comparison with predicted large-scale bias 

We are now in a position to compare the measured large 
scale bias factor with that predicted from fitting the mass 
function and applying the peak background split to estimate 
$b_1$. The peak-background split prediction is 



bi = l- 
so $b_1$ associated with equations (19) and (20) is 
The thick solid lines in Figure 8 show the measurement, $P_{\rm hm}/P_{\rm mm}$ at
$k = 0.03\,h\,{\rm Mpc}^{-1}$. The thickness of the lines shows the two-$\sigma$ range for
the measurement, i.e., two times the error on the mean value. Each triple of symbols shows
the predicted bias ($b_1$ of equation 22) associated with our three ways of fitting the
mass function to equation (19). Clearly, they give similar results. The error bars show the
scatter in the predicted peak-background split bias between the 49 simulations (i.e., we
use the best fit p and q obtained from fitting the halo abundances in a simulation to
predict its $b_1$; the scatter in p and q between simulations translates into scatter in
$b_1$). The upper and lower panels show results at z = 0 and z = 0.5 respectively.

Figure 8. Comparison of measured large scale bias factor (thick solid line) with the
predicted $b_1$ of equation (22), for the same three bins in halo mass shown in the
previous figure (higher masses have larger bias factors). The parameters p and q of $b_1$
are obtained from fitting the mass function to equation (19). For each mass bin, the three
symbols with error bars show the predictions associated with our three ways of fitting the
mass function; the error bars show the scatter in the bias between the 49 simulations,
divided by $\sqrt{49}$. Upper panel shows results at z = 0, lower panel at z = 0.5.

The differences between the measurements and the predicted values of $b_1$ are
statistically significant, especially for masses which are large compared to $M_*$.
Figure 9 shows that this is not due to the parametric form assumed for the halo mass
function: fitting to equation (20) and using the associated expression for $b_1$
(equation 23) yields similar results. (There is one obvious difference: at high masses, the
uncertainty on the predicted $b_1$ is similar to that associated with equation 22, but at
lower masses, the uncertainty associated with equation 23 is substantially larger. This is
because, at high masses, both formulae for $b_1$ are sensitive only to the scale of the
exponential cut-off in halo counts, which is determined by the parameters q and c
respectively. At lower masses, the other parameters also matter, of which there are more
for equations 20 and 23 than for equations 19 and 22.) We find qualitatively similar
effects for all our choices of $l_{\rm link}$.

Figure 9. Same as previous figure, but now $b_1$ is from equation (23), with parameters
from fitting the mass function to equation (20).

What should we make of the discrepancy between the measured large scale bias and $b_1$ at
high masses? Following the discussion of Section 2.3, such differences are not unexpected,
because the peak-background split bias relation is nonlinear. As a result, the expected
large scale bias factor $b_\times$ depends on the higher order bias parameters $b_2$ and
$b_3$ as well as $b_1$ (see equation 11). Like $b_1$, these also depend on halo mass, and
the parametrization of the halo mass function. Explicit formulae are provided in
Appendix B, and Table 3 provides the numerical values associated with the fits to
equation (19).

Unfortunately, the expected difference depends on a smoothing scale R for which we have no
underlying theory. On the other hand, equation (10) shows that we expect
$b_\times \approx b_1$ for our lower mass bins, but that $b_\times > b_1$ at very large
masses, in qualitative agreement with our measurements. (For lower masses than we are
studying here, we expect $b_\times < b_1$.) Therefore, we have treated R as a free
parameter, to allow equation (11) for $b_\times$ to fit as well as possible. The predicted
difference between $b_\times$ and $b_1$ which results sometimes has the wrong sign, because
$b_3$ can be large and negative (see Table 3). The differences at large masses are
qualitatively consistent with our measurements if we ignore higher order terms in
$\sigma_R$ and we set $b_3 = 0$, although there is no theoretical justification for either
of these steps. And if we do this, then we are unable to match the measurements at lower
masses. Thus, while equation (11) can sometimes account qualitatively for the differences
seen in Figures 8 and 9 ($b_2$ and $b_3$ are both negative in the low-mass limit), it
cannot account in detail for the observed differences. This suggests that the deterministic
nonlinear local bias model does not provide a sufficiently accurate description of halo
bias.

Figure 10. Configuration space estimate of halo bias, $\sqrt{\xi_{\rm hh}/\xi_{\rm dm}}$,
for the same mass bins as in previous figures, when $l_{\rm link} = 0.2$ at z = 0.5 (left)
and z = 0 (right). Error bars show the error on the mean value between simulations.



3.7 Comparison with bias from configuration 
space 

So far we have been measuring the large scale bias from simulations in Fourier space using
$P_{\rm hm}$. But one can also measure it in configuration space from the correlation
function ratio $\xi_{\rm hh}/\xi_{\rm dm}$. Figure 10 shows
$\sqrt{\xi_{\rm hh}/\xi_{\rm dm}}$ for the same three halo mass bins when
$l_{\rm link} = 0.2$. Error bars show the error on the mean value between simulations. A
constant bias is a good description of the measurement on scales between 25 and
$75\,h^{-1}$Mpc. The average value of this ratio, computed between
$r = [40, 60]\,h^{-1}$Mpc, is shown by the solid horizontal lines. At scales close to the
acoustic peak ($105\,h^{-1}$Mpc for our cosmological model) the bias has some scale
dependence, particularly for the highest mass halos, which we discuss shortly.
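The configuration-space estimate just described amounts to averaging $\sqrt{\xi_{\rm hh}/\xi_{\rm dm}}$ over the separation bins that fall in the chosen range. A minimal sketch follows; the function name is hypothetical and the unweighted mean over bins is an assumption, since the weighting is not specified in the text.

```python
import math

def config_space_bias(r, xi_hh, xi_dm, r_min=40.0, r_max=60.0):
    """Mean of sqrt(xi_hh / xi_dm) over separation bins with r_min <= r <= r_max.

    r, xi_hh, xi_dm : sequences of equal length (r in h^-1 Mpc)
    """
    ratios = [math.sqrt(hh / dm)
              for ri, hh, dm in zip(r, xi_hh, xi_dm)
              if r_min <= ri <= r_max]
    if not ratios:
        raise ValueError("no separation bins in the requested range")
    return sum(ratios) / len(ratios)
```

Applying this to each simulation separately, and then taking the mean and scatter over realizations, gives the horizontal lines and error bars of the kind shown in Figure 10.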

Figure 11 compares the Fourier space measurement of $P_{\rm hm}/P_{\rm mm}$ (bars on the
left of each panel), with the mean and dispersion of $\sqrt{\xi_{\rm hh}/\xi_{\rm dm}}$
(thick solid bars on right of each panel). (Recall that, for each simulation, these ratios
are averaged over the range $r = [40, 60]\,h^{-1}$Mpc.) The widths of the bars show the
$2\sigma$ error on the mean measured bias (i.e., the rms dispersion times $2/\sqrt{49}$),
indicating that these two measures of the bias are slightly but significantly different
for the highest mass bin. Each pair of error bars shows the two peak background split
predictions for $b_1$ (equations 22 and 23, and recall that the latter has substantially
larger uncertainties) for each of the three methods we use when fitting the mass function
(from left to right, these are New ML, Poisson ML, $\chi^2$-method). Notice that the
predictions are closer to the configuration space measurement than the other one, but the
difference is still significant.
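The width of each bar is simple arithmetic on the per-simulation values, sketched below. The helper names are hypothetical, the rms is taken as the population standard deviation about the mean, and treating two measurements as different when their $2\sigma$ bars do not overlap is only one possible reading of "significantly different".

```python
import math
import statistics

def mean_with_two_sigma(values):
    """Return (mean, 2-sigma error on the mean), i.e. (mean, 2 * rms / sqrt(N))."""
    n = len(values)
    rms = statistics.pstdev(values)  # assumption: population rms about the mean
    return statistics.mean(values), 2.0 * rms / math.sqrt(n)

def bars_do_not_overlap(a, b):
    """True if the 2-sigma bars around the two means are disjoint."""
    (mean_a, err_a), (mean_b, err_b) = mean_with_two_sigma(a), mean_with_two_sigma(b)
    return abs(mean_a - mean_b) > err_a + err_b
```

With N = 49 realizations the factor is $2/\sqrt{49} = 2/7$, so the bars are considerably narrower than the realization-to-realization scatter.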

Unfortunately, it is not straightforward to compare $b_\xi$ of equation (12) with our
measurements, because the theory calculation is for the correlation function of the
smoothed halo field (divided by that of the similarly smoothed mass field), whereas our
measurements of $\xi_{\rm hh}$ and $\xi_{\rm dm}$ are made on the unsmoothed point
distributions. Nevertheless, because we measure $b_\xi < b_\times$, and this is
qualitatively consistent with equation (12), we might ask what effective smoothing radius
is required to explain the difference. For our large mass bins, this radius is of order
$R \sim 40\,h^{-1}$Mpc. However, although this would make $b_\xi = b_\times$, it does not
explain the magnitude of the difference from $b_1$.

We noted that the halo bias has some scale dependence around the acoustic peak scale
($105\,h^{-1}$Mpc for our cosmological model). This scale dependent halo bias is consistent
with the trends reported by Smith et al. (2007, 2008), which have since been confirmed by a
number of authors (Sanchez et al. 2008; Sanchez et al. 2009; Kim et al. 2008).



3.8 Halos from spherical overdensity 

It is well known that some objects identified by a Friends of Friends algorithm may have
dumb-bell like shapes. In this case, the algorithm labels as a single massive object what
might better be classified as two separate objects of smaller mass. This changes how the
abundance and the clustering depend on mass, so one might wonder if some of the
discrepancy with the peak-background split predictions we find can be attributed to our
choice of group-finder.

Figure 11. Comparison of large scale bias estimates for the same halo mass bins as in
previous figures when $l_{\rm link} = 0.2$. Thick bars show the measured
$P_{\rm hm}/P_{\rm mm}$ (left) and $\sqrt{\xi_{\rm hh}/\xi_{\rm mm}}$ (right), and symbols
with error bars show the linear bias parameter $b_1$ predicted from the peak background
split.

In this section, we perform the same analysis as before, but now using halos identified
with a spherical overdensity (SO) requirement. Halos were identified as spherical regions,
each 200 times denser than the background, in the z = 0 outputs of our simulations by
J. Tinker following standard methods. We compute the abundance, cross-power bias
$b_\times$, and autocorrelation bias $b_\xi$ for three bins in halo mass. Whereas the two
higher mass bins are the same as before, the lowest mass bin is slightly different, due to
details of how the halo finder was run. Results for these measured bias factors are shown
as bars in Figure 12, together with the peak-background split prediction from their mass
function (black dots with error bars). These show that $b_\times$ is about 5% larger than
$b_\xi$, which is itself larger than the peak-background split prediction. These
differences are in the same sense, and have the same magnitude, as our previous results
based on FoF halos (Figure 11). As an extra test we have computed $b_1$ for the higher mass
bin by fitting the mass function of SO halos only in the mass bin range instead of the
wider range available. In this case the difference between $b_1$ and $b_\xi$ was reduced by
half, but it remains significant. We conclude that our finding that
$b_1 \neq b_\times \neq b_\xi$ does not depend on how halos were identified.



4 DISCUSSION AND CONCLUSIONS 

The peak-background split argument is commonly used to 
relate the abundances of dark matter halos to their spatial 
clustering. We have found that this estimate of the bias be- 
tween halos and the dark matter is not accurate to better 
than ~ 10 percent when compared with different measures 
of large scale bias, particularly for the most massive halos. 
We did not test the intermediate or low mass regime. 

Our results are insensitive to a) how exactly we define 



Figure 12. Same as previous figure, but now for halos identified 
using an SO algorithm. Results are shown for the same mass bins 
as before, except that the lowest mass bin is from S.gS-lO^^M© to 
7.0- IO-'^'^Mq . Thick bars show the measured Phm/Pmm (left) and 
$\sqrt{\xi_{\rm hh}/\xi_{\rm mm}}$ (right); the thickness of the bars indicates the
two-$\sigma$ range. Symbols with error bars show the linear bias parameter $b_1$
predicted from fitting equation (19) to the halo abundances using
the Poisson and $\chi^2$ methods. Error bars show the rms scatter
between realizations.

halos, b) the exact functional form of the mass function and 
c) how the mass function was fitted. We have checked this by 
exploring three friends-of-friends linking lengths for defining 
the halo catalogs, 0.15, 0.168 and 0.2 (see Figures 1-4), as 
well as using a spherical overdensity criterion (Section 3.8); 
two functional forms for the mass function (equations 19 
and 20, for which the associated linear bias factors bi are 
given by equations 22 and 23); and three methods for fitting 
halo counts to these functional forms, one of which is new. 
The latter is a likelihood estimator that maximizes the prob- 
ability that a randomly chosen particle belongs to a halo of 
specified mass; it does not require binned halo counts, thus 
removing the arbitrariness of the choice of bin size which is 
intrinsic to more standard methods. 

We have also studied the self-similarity of the mass func- 
tion at different linking lengths for z = 0, 0.5, 1 and find that 
it is qualitatively but not exactly self-similar (see Figure 4). 
We have argued that this difference may be reduced by scal- 
ing the linking-length as a function of redshift as suggested 
by the spherical collapse model. 

Results for the different estimates of large-scale halo bias are shown for two different
redshifts in Figures 8 and 9. Although halo bias appears to be close to linear on large
scales (Figures 7 and 10), the bias factor $b_\xi = \sqrt{\xi_{\rm hh}/\xi_{\rm dm}}$ one
measures at large r is different from $b_\times = P_{\rm hm}/P_{\rm mm}$ measured at small
k, and both are different from the peak-background split estimate of the linear bias
factor $b_1$, at large masses where $b_1 > 2$ (Figures 11 and 12). On the other hand, at
lower masses where $b_1 \approx 2$, $b_1 \approx b_\times \approx b_\xi$ to within a few
percent.

We discussed possible explanations for the differences at large masses. For example, the
contribution of nonlinear bias terms, $b_2$, $b_3$, etc., which are generic to the
peak-background split argument (we provide explicit expressions in Appendix B), makes
$b_1 \neq b_\times \neq b_\xi$ (see equations 10 and 12). However, the amplitude of these
corrections depends on a parameter, $\sigma_R$, for which there is no underlying theory,
other than the expectation that it is smaller than unity, but greater than zero. While
nonlinear terms could explain the difference between $b_\xi$ and $b_\times$, the
differences between these bias factors and $b_1$ are consistent with our measurements only
if we ignore terms of order $\sigma_R^4$ and higher, and we set $b_3 = 0$, although there is
no theoretical justification for either of these steps. But then, to be self-consistent, we
should use the same algorithm for the lower mass bins, and there, what (barely) worked for
the high masses no longer works (because $b_2$ and $b_3$ are negative).

Although our analysis was restricted to massive halos, it is likely that our conclusions
about the (in)accuracy of the peak background split extend to lower masses. To illustrate,
Figure 13 shows how the predicted $b_\times$ differs from the linear bias factor $b_1$, for
a number of choices of the unknown parameter $\sigma_R$. (To make the plot, we have ignored
terms of order $\sigma_R^4$ and higher in equation 10.) Note that the difference between
$b_\times$ and $b_1$ is not simple: at high masses where $b_1 > 2$, $b_\times > b_1$,
whereas the opposite is true at intermediate masses, and $b_\times \approx b_1$ at very low
masses. In recent simulations which resolve smaller halos (e.g., Boylan-Kolchin et al.
2009), the measured large scale bias is indeed smaller than $b_1$, in qualitative agreement
with Figure 13. However, comparison with Fig. 10 of Boylan-Kolchin et al. (2009) shows
that, at the 10% level, the quantitative agreement is not good.

Figure 13. Dependence of $b_\times$ (equation 10, with terms of order $\sigma_R^4$ and
higher set to zero) on the smoothing parameter $\sigma_R$, when the bias factors $b_1$,
$b_2$ and $b_3$ are given by equation (B4) with $(p, q) = (0.25, 0.7)$. Solid curve shows
the linear bias parameter $b_1$, which corresponds to the $\sigma_R \to 0$ limit of
$b_\times$.

We conclude that more work is needed to understand the nature of halo bias at the few
percent level. Our results suggest that we are beginning to see the limitations of the
local deterministic bias model: while the inclusion of higher order bias terms can
sometimes explain the qualitative difference between $b_1$, $b_\times$ and $b_\xi$, it does
not work quantitatively for all masses. As one alternative, we considered a peaks-bias
model which is linear but nonlocal and scale dependent in k-space. More work is needed
before a fair quantitative comparison of this model with the measurements can be made, but
our measurements suggest qualitative agreement. Another, which we are pursuing, is to study
models in which the evolution between initial and evolved fields (e.g., equation 2) is no
longer a deterministic function of the overdensity.

Finally, we note that our expression for the bias factor implicitly assumes that the mass
function has a universal form. The fact that it is not quite universal will modify the
bias factor predicted by the peak-background split (Sheth & Tormen 1999), although work in
progress suggests this is not enough to explain the discrepancies we have found.
APPENDIX A: FITTING THE HALO MASS FUNCTION

This Appendix defines a Maximum likelihood estimator of 
the halo mass function that does not require binned halo 
counts. The key is to think about the mass function in 
exactly the same way that theorists do when modeling it. 
Namely, the question is not: How many halos are there in 
a certain mass bin in the simulation box? but, What is the 
probability that a randomly chosen particle in the simula- 
tion box was in a halo of mass m? 

Let $[dn(m)/dm]\,dm$ denote the number density of haloes of mass m. Then the fraction of
particles in such haloes is

$$ f(m)\,dm = \frac{m}{\bar\rho}\,\frac{dn(m)}{dm}\,dm. \qquad (A1) $$

Let $f(m|\theta)\,dm$ denote a theoretical model of this quantity, where $\theta$ denotes
the vector of parameters which specifies the model. Then the likelihood to be maximized is

$$ \mathcal{L}(\theta) = \prod_{i=1}^{N_p} f(m_i|\theta), \qquad (A2) $$

where the product is over all $N_p$ particles in the simulation box. In practice, one only
measures halos down to some minimum mass. This modifies the estimator above to

$$ \mathcal{L}(\theta) = F(m < M_{\min}|\theta)^{N_p - N_{m>M_{\min}}}
   \prod_{i=1}^{N_{m>M_{\min}}} f(m_i|\theta), \qquad (A3) $$

where

$$ F(m < M_{\min}|\theta) = 1 - \int_{M_{\min}}^{\infty} dm\, f(m|\theta), \qquad (A4) $$

and $N_{m>M_{\min}}$ is the total number of particles in halos above the minimum mass. We
have explicitly written this as unity minus the integral over massive halos to allow for
the possibility that bound halos below some mass scale may not exist (and because some
authors choose functional forms which lead to divergences when integrated over all m).
This way of writing the probability shows that it is trivial to account for this
possibility.

Now, because one has found the halos, one need not draw from the particle list when
computing the likelihood; one can use the (considerably smaller!) halo catalog instead.
I.e.,

$$ \mathcal{L}(\theta) = F(m < M_{\min}|\theta)^{N_p - N_{m>M_{\min}}}
   \prod_{i=1}^{N_h} f(m_i|\theta)^{N_i}, \qquad (A5) $$

where the product is now over the $N_h$ halos in the box, $N_i$ is the number of particles
in halo i, and

$$ N_{m>M_{\min}} = \sum_{i=1}^{N_h} N_i. \qquad (A6) $$

The derivatives of $\ln\mathcal{L}(\theta)$ with respect to the parameters $\theta_i$ can
be done analytically, so this method is fast. The second derivatives provide analytic
estimates of the shape of the likelihood surface near the maximum, and hence of the
uncertainties on the best-fit parameters.
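The structure of the halo-catalog likelihood (A5) can be illustrated with a short sketch. Everything below is an assumption made for illustration: the function names, the one-parameter exponential toy model (which is not a halo mass function), and the golden-section maximizer, which stands in for the analytic derivatives mentioned above.

```python
import math

def log_likelihood(theta, halos, n_particles, x_min, f, frac_above):
    """ln L of eq. (A5).

    halos       : list of (x_i, N_i): halo mass variable and particle count
    n_particles : N_p, total number of particles in the box
    f           : callable f(x, theta), mass fraction per unit x
    frac_above  : callable giving the integral of f from x_min to infinity
    """
    n_in_halos = sum(n_i for _, n_i in halos)        # eq. (A6)
    below = 1.0 - frac_above(x_min, theta)           # eq. (A4)
    lnl = (n_particles - n_in_halos) * math.log(below)
    lnl += sum(n_i * math.log(f(x_i, theta)) for x_i, n_i in halos)
    return lnl

# Toy model for illustration only: f(x) = theta * exp(-theta * x).
def toy_f(x, theta):
    return theta * math.exp(-theta * x)

def toy_frac_above(x_min, theta):
    return math.exp(-theta * x_min)

def maximize_1d(func, lo, hi, tol=1e-8):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if func(c) > func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

halos = [(2.0, 3), (3.0, 2)]   # hypothetical (x_i, N_i) pairs
theta_hat = maximize_1d(
    lambda t: log_likelihood(t, halos, 10, 1.0, toy_f, toy_frac_above), 0.1, 5.0)
```

Note that the particles not assigned to any halo enter only through the single factor $F$, which is why the estimator needs the halo catalog and the total particle count but no binning.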

In practice, the mass functions of current interest are written in terms of the scaled
variable $\nu$. Therefore, we scale all masses m to $\nu$ using equation (16), and then
write the likelihood in these scaled variables before maximizing:

$$ \mathcal{L}(\theta) = F(\nu < \nu_{\min}|\theta)^{N_p - N_{m>M_{\min}}}
   \prod_{i=1}^{N_h} f(\nu_i|\theta)^{N_i}. \qquad (A7) $$

It is straightforward but tedious to compute the first and second derivatives with respect
to the parameters $\theta$. Doing so gives an idea of the expected accuracy of and
covariances between the best-fitting parameters. However, a more intuitive demonstration
of the covariances can be got by noting that, for large $M_{\min}$, the vast majority of
particles in the simulation are not assigned to halos, and so the line of degeneracy is
driven by requiring that the model always produce the observed mass fraction in halos. For
example, when fitting to equation (19), the parameters p and q must change so as to keep
$A_p\,[\Gamma(1/2, q\nu_{\min}/2, \infty)/\Gamma(1/2) + 2^{-p}\,\Gamma(1/2 - p, q\nu_{\min}/2, \infty)/\Gamma(1/2)]$
fixed. The solid line in Figure A1 shows this curve for halos of mass
$M > 6.31\times10^{13}\,h^{-1}M_\odot$ identified with $l_{\rm link} = 0.2$ at z = 0, at
which time the mass fraction in halos is 0.13 (this is the mean over all 49 simulations;
the actual fraction varies slightly from one realization to another). Symbols show the
best fit parameters for each of the 49 simulations.
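The degeneracy curve can be traced numerically by solving for q at each p so that the mass fraction above $\nu_{\min}$ stays fixed. The sketch below is an illustration under stated assumptions: $A_p$ is taken to be the normalization that makes equation (19) integrate to unity over all $\nu$, the incomplete gamma function is evaluated by a simple Simpson rule, and the value of $\nu_{\min}$ used in any call is the caller's choice rather than the value appropriate to the simulations.

```python
import math

def upper_gamma(s, x, n=2000, u_max=50.0):
    """Integral of t**(s-1) * exp(-t) from x to infinity (Simpson rule, x > 0)."""
    h = u_max / n
    def g(u):
        t = x + u
        return t ** (s - 1.0) * math.exp(-t)
    total = g(0.0) + g(u_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0

def st_mass_fraction(p, q, nu_min):
    """Mass fraction above nu_min for equation (19), assuming p < 1/2."""
    x = 0.5 * q * nu_min
    g_half = math.gamma(0.5)
    # Assumed normalization: total mass fraction over all nu equals one.
    a_p = 1.0 / (1.0 + 2.0 ** (-p) * math.gamma(0.5 - p) / g_half)
    return a_p * (upper_gamma(0.5, x) + 2.0 ** (-p) * upper_gamma(0.5 - p, x)) / g_half

def q_on_degeneracy_curve(p, nu_min, target, lo=0.05, hi=5.0):
    """Bisect for the q that gives mass fraction `target` at fixed p."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if st_mass_fraction(p, mid, nu_min) > target:
            lo = mid   # the fraction falls as q grows, so q must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `q_on_degeneracy_curve` for a range of p at fixed target fraction (0.13 in the case quoted above) gives one point of the curve per call.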

Figure A2 shows a similar comparison of the measured 
covariances between best fit parameters of equation (20). 
We have not shown the expected correlations for this case. 



APPENDIX B: BIAS FACTORS 

In the peak background split ansatz, one writes the halo 
fluctuation $\delta_h$ as a power series of the mass fluctuation: 



Figure A2. Top: Measured z = 0 halo abundances (link length 0.2) when the 49 simulations
have been combined. Error bars show the rms variation between simulations. Curves show the
result of fitting equation (20) to the counts using the three methods described in the
main text. All methods return essentially the same counts at the lowest $\nu$ we probe;
they differ slightly at higher $\nu$. Bottom: Covariance between best-fit parameters for
each of the 49 simulations with linking-length 0.2 and redshift z = 0. The fractional error
on c is much smaller than on the other parameters. Stars, crosses and tripods show results
for the ML, Poisson and $\chi^2$ methods: there is no systematic trend with fitting method.
Filled solid circles show the parameters associated with fitting to the combined counts.



$$ \delta_h = \sum_{i>0} \frac{b_i}{i!}\,\delta^i, \qquad (B1) $$



and one obtains the coefficients $b_i$ by taking appropriate derivatives of the halo mass
function, and accounting for the fact that halo abundances are estimated in the initial
field $\delta_0$ rather than the evolved field $\delta$ (Mo & White 1996; Mo et al. 1997;
Sheth & Tormen 1999). Namely, one assumes there is a deterministic mapping between
$\delta_0$ and $\delta$:

$$ \delta_0 = \sum_i a_i\,\delta^i, \qquad (B2) $$

and that this mapping is given by the spherical evolution model

$$ a_1 = 1,\quad a_2 = -\frac{17}{21},\quad a_3 = \frac{341}{567},\quad \text{and}\quad
   a_4 = -\frac{55805}{130977}. \qquad (B3) $$



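Then,

$$ \begin{aligned}
b_1 &= 1 + \epsilon_1 + E_1, \\
b_2 &= 2(1 + a_2)(\epsilon_1 + E_1) + \epsilon_2 + E_2, \\
b_3 &= 6(a_2 + a_3)(\epsilon_1 + E_1) + 3(1 + 2a_2)(\epsilon_2 + E_2) + \epsilon_3 + E_3, \\
b_4 &= 24(a_3 + a_4)(\epsilon_1 + E_1) + 12\left[a_2^2 + 2(a_2 + a_3)\right](\epsilon_2 + E_2) \\
    &\quad + 4(1 + 3a_2)(\epsilon_3 + E_3) + \epsilon_4 + E_4,
\end{aligned} \qquad (B4) $$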






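where

$$ \epsilon_1 = \frac{q\nu - 1}{\delta_c},\quad
   \epsilon_2 = \frac{q\nu\,(q\nu - 3)}{\delta_c^2},\quad
   \epsilon_3 = \frac{q\nu\,(q^2\nu^2 - 6q\nu + 3)}{\delta_c^3},\quad
   \epsilon_4 = \frac{q^2\nu^2\,(q^2\nu^2 - 10q\nu + 15)}{\delta_c^4}, $$

$$ E_1 = \frac{2p/\delta_c}{1 + (q\nu)^p},\quad
   \frac{E_2}{E_1} = \frac{2p + 2q\nu - 1}{\delta_c},\quad
   \frac{E_3}{E_1} = \frac{4p^2 + 6q\nu p + 3q^2\nu^2 - 6q\nu - 1}{\delta_c^2}, $$

$$ \frac{E_4}{E_1} = \frac{2\left[4p^3 + (8q\nu + 4)p^2 + (6q^2\nu^2 - 6q\nu - 1)p\right]}{\delta_c^3}
   + \frac{2\,(2q^3\nu^3 - 9q^2\nu^2 + q\nu - 1)}{\delta_c^3} \qquad (B5) $$

for the mass function of equation (19) (Scoccimarro et al. 2001).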
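For readers who want numbers, the bias factors for the mass function of equation (19) can be evaluated with a few lines of code. This is a sketch, not the code behind Table 3: the function name is hypothetical, $\delta_c = 1.686$ is assumed with no redshift dependence, and $\nu$ is assumed to be the square of the peak height, which is the convention in which the $\epsilon_k$ are polynomials in $q\nu$.

```python
DELTA_C = 1.686  # assumed spherical-collapse threshold; redshift dependence ignored

# Spherical-evolution coefficients a_2, a_3, a_4 of eq. (B3)
A2, A3, A4 = -17.0 / 21.0, 341.0 / 567.0, -55805.0 / 130977.0

def st_bias_factors(nu, p, q, delta_c=DELTA_C):
    """Return (b1, b2, b3, b4) of eq. (B4) for the mass function of eq. (19)."""
    x = q * nu
    eps1 = (x - 1.0) / delta_c
    eps2 = x * (x - 3.0) / delta_c ** 2
    eps3 = x * (x * x - 6.0 * x + 3.0) / delta_c ** 3
    eps4 = x * x * (x * x - 10.0 * x + 15.0) / delta_c ** 4
    e1 = (2.0 * p / delta_c) / (1.0 + x ** p)
    e2 = e1 * (2.0 * p + 2.0 * x - 1.0) / delta_c
    e3 = e1 * (4.0 * p * p + 6.0 * x * p + 3.0 * x * x - 6.0 * x - 1.0) / delta_c ** 2
    e4 = e1 * 2.0 * (4.0 * p ** 3 + (8.0 * x + 4.0) * p * p
                     + (6.0 * x * x - 6.0 * x - 1.0) * p
                     + 2.0 * x ** 3 - 9.0 * x * x + x - 1.0) / delta_c ** 3
    s1, s2, s3, s4 = eps1 + e1, eps2 + e2, eps3 + e3, eps4 + e4
    b1 = 1.0 + s1
    b2 = 2.0 * (1.0 + A2) * s1 + s2
    b3 = 6.0 * (A2 + A3) * s1 + 3.0 * (1.0 + 2.0 * A2) * s2 + s3
    b4 = (24.0 * (A3 + A4) * s1 + 12.0 * (A2 ** 2 + 2.0 * (A2 + A3)) * s2
          + 4.0 * (1.0 + 3.0 * A2) * s3 + s4)
    return b1, b2, b3, b4
```

Setting p = 0 removes the $E_k$ terms, which gives a convenient check against the simpler Gaussian-cutoff case.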

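For the functional form of equation (20),

$$ \epsilon_1 = \frac{c\nu}{\delta_c},\quad
   \epsilon_2 = \frac{c\nu\,(c\nu - 1)}{\delta_c^2},\quad
   \epsilon_3 = \frac{c^2\nu^2\,(c\nu - 3)}{\delta_c^3},\quad
   \epsilon_4 = \frac{c^2\nu^2\,(c^2\nu^2 - 6c\nu + 3)}{\delta_c^4}, $$

$$ E_1 = \frac{2ab}{\delta_c\,(\nu')^{a} + b\,\delta_c},\quad
   \frac{E_2}{E_1} = \frac{2a + 2c\nu + 1}{\delta_c},\quad
   \frac{E_3}{E_1} = \frac{4a^2 + 6c\nu a + 6a + 3c^2\nu^2 + 2}{\delta_c^2}, \qquad (B6) $$

$$ \frac{E_4}{E_1} = \frac{2\left[4a^3 + 4(2c\nu + 3)a^2 + (6c^2\nu^2 + 6c\nu + 11)a\right]}{\delta_c^3}
   + \frac{2\,(2c^3\nu^3 - 3c^2\nu^2 + c\nu + 3)}{\delta_c^3}. $$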

We note that the assumption of equation (B2) is strong, and only an approximation in
triaxial collapse models (Ohta et al. 2004; Lam & Sheth 2009). Accounting for this is the
subject of ongoing work.



